Abstract 

In this paper we study the concurrent cops and robber (CCCR) game. CCCR follows the same rules as 
the classical, turn-based game, except for the fact that the players move simultaneously. The cops’ goal is to 
capture the robber and the concurrent cop number of a graph is defined as the minimum number of cops which 
guarantees capture. For the variant in which it is required to capture the robber in the shortest possible time, 
we let time to capture be the payoff function of CCCR; the (game theoretic) value of CCCR is the optimal 
capture time and (cop and robber) time optimal strategies are the ones which achieve the value. In this paper 
we prove the following. 

1. For every graph G, the concurrent cop number is equal to the “classical” cop number. 

2. For every graph G, CCCR has a value, the cops have an optimal strategy and, for every $\varepsilon > 0$, the 
robber has an $\varepsilon$-optimal strategy. 


1 Introduction 

In this paper we study the concurrent cops and robber (CCCR) game. In the classical CR game [171 B each 
player observes the other player’s move before he performs his own. On the other hand, in concurrent CR the 
players move simultaneously. In all other aspects, the concurrent game (henceforth CCCR) follows the same 
rules as the classical, turn-based game (henceforth TBCR). 

The CCCR game (similarly to TBCR) can be considered as either a game of kind (the cops’ goal is to capture 
the robber) or a game of degree (the cops’ goal is to capture the robber in the shortest possible time).¹ 

This paper is organized as follows. In Section 2 we define preliminary concepts and notation and use these 
to define the CCCR game rigorously. In Section 3 we concentrate on the “game of kind” aspect: we define the 
concurrent cop number $\tilde{c}(G)$ and prove that, for every graph $G$, it is equal to the “classical” cop number $c(G)$. 
In Section 4 we concentrate on the “game of degree” aspect: we equip CCCR with a payoff function (namely 
the time required to capture the robber) and prove that (a) CCCR has a game theoretic value, (b) the cops 
have an optimal strategy and (c) for every $\varepsilon > 0$ the robber has an $\varepsilon$-optimal strategy; in addition we provide 
an algorithm for the computation of the value and the optimal strategies. In Section 5 we discuss related work. 
Finally, in Section 6 we present our conclusions and future research directions. 


2 Preliminaries 

In this section, as well as in the rest of the paper, we will mainly concern ourselves with the case of a single cop; 
this is reflected in the following definitions and notation. In case K > 1 cops are considered, this will be stated 
explicitly; the extension of definitions and notation is straightforward. 

2.1 Definition of the CCCR Game 

Both CCCR and TBCR are played on an undirected, simple and connected graph $G = (V, E)$ by two players 
called C and R. Player C, controlling $K$ cops (with $K \geq 1$) pursues a single robber controlled by player R (we 
will sometimes call both the cops and robber tokens). We assume the reader is familiar with the rules of TBCR 
and proceed to present the rules of CCCR for the case of K = 1 (a single cop). 

1 This terminology is due to Isaacs [9]. 

1. The game starts from given initial positions: the cop is located at $x_0 \in V$ and the robber at $y_0 \in V$. 

2. At the $t$-th round ($t \in \mathbb{N}$) C moves the cop to $x_t \in N[x_{t-1}]$ and simultaneously R moves the robber to 
$y_t \in N[y_{t-1}]$.² 

3. At every round both players know the current cop and robber location (and remember all past locations). 

4. A capture occurs at the smallest $t \in \mathbb{N}$ for which either of the following conditions holds: 

(a) The cop is located at $x_t$, the robber is located at $y_t$, and $x_t = y_t$. This capture condition is the same 
as in TBCR. 

(b) The cop is located at $x_{t-1}$ and moves to $y_{t-1}$, while the robber is located at $y_{t-1}$ and moves to $x_{t-1}$. 
We will call this an “en passant” capture; it does not have an analog in TBCR. 

5. C wins if capture takes place for some $t \in \mathbb{N}$. Otherwise, R wins. The game analysis becomes easier if 
we assume that the game always lasts an infinite number of rounds; if a capture occurs at $t_c$, then we will 
have $x_t = y_t = x_{t_c}$ for all $t \geq t_c$. 

We will denote the above defined game, played on graph $G = (V, E)$ and starting from initial position 
$(x, y) \in V^2$, by $\Gamma^G_{(x,y)}$. In case the game is played with $K$ cops, it will be denoted by $\Gamma^{G,K}_{(x,y)}$ (in this case $x \in V^K$). 

2.2 Nomenclature and Notation 

The following quantities will be used in the subsequent analysis (once again, we present definitions for the case 
of K = 1). Some of them require two separate definitions: one for TBCR and another for CCCR. 

Definition 2.1 A position in TBCR is a triple $(x, y, P)$ where $x \in V$ is the cop location, $y \in V$ is the robber 
location and $P \in \{C, R\}$ is the player whose turn it is to move. We also have $|V| + 1$ additional positions: 

1. the position $(\emptyset, \emptyset, C)$ corresponds to the beginning of the game, before either player has placed his token; 

2. the positions $(x, \emptyset, R)$, $x \in V$, correspond to the phase of the game in which C has placed the cop but R has 
not placed the robber. 

The set of all TBCR positions is denoted by $\overline{S} = V \times V \times \{C, R\}$. 

Definition 2.2 A position in CCCR is a pair $(x, y)$ where $x \in V$ is the cop location and $y \in V$ is the robber 
location. The set of all CCCR positions is denoted by $S = V \times V$. 

Definition 2.3 A history is a position sequence of finite or infinite length. The set of all game histories of any 
finite length is denoted by $\overline{S}^*$ for TBCR and $S^*$ for CCCR. The set of all infinite game histories is denoted by 
$\overline{S}^\infty$ for TBCR and $S^\infty$ for CCCR. 

In both TBCR and CCCR, the players’ moves are graph nodes, e.g., $x, y \in V$. Given the next move (in 
TBCR) or moves (in CCCR) the next game position is determined by the transition function, which encodes the 
rules of the respective game. 

Definition 2.4 In TBCR, the transition function $\overline{Q} : \overline{S} \times V \to \overline{S}$ is defined as follows: 

2 $N[u]$ denotes the closed neighborhood of node $u$, i.e., the set containing $u$ itself and all nodes connected to $u$ by an edge. 

Definition 2.5 In CCCR, the transition function $Q : S \times V \times V \to S$ is defined as follows: 

$$
Q((x,y),x',y') =
\begin{cases}
(x,x) & \text{when } x = y,\\
(x',x') & \text{when } x \neq y,\ x' \in N[x],\ y' \in N[y],\ x' = y \text{ and } y' = x,\\
(x',y') & \text{when } x \neq y,\ x' \in N[x],\ y' \in N[y] \text{ otherwise},\\
(x,y') & \text{when } x \neq y,\ x' \notin N[x] \text{ and } y' \in N[y],\\
(x',y) & \text{when } x \neq y,\ x' \in N[x] \text{ and } y' \notin N[y],\\
(x,y) & \text{when } x \neq y,\ x' \notin N[x] \text{ and } y' \notin N[y].
\end{cases}
$$


The above rules have the following consequences (which will facilitate our subsequent analysis). 


1. The CR game continues for an infinite number of rounds; but if a capture occurs at some time $t_c$, the cop 
and robber locations remain fixed for all subsequent times. 

2. The transition function accepts “illegal” moves (e.g., $x' \notin N[x]$) as input but “ignores” them, in the sense 
that they have no influence on the location of tokens. 
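
The case analysis of Definition 2.5 can be sketched in a few lines of Python. This is only an illustration: the adjacency-dictionary representation of the graph and the names `closed_neighborhood`, `transition` and `PATH3` are assumptions made here, not notation from the paper.

```python
def closed_neighborhood(adj, u):
    """Return N[u]: the node u together with all of its neighbors."""
    return {u} | set(adj[u])


def transition(adj, position, cop_move, robber_move):
    """Sketch of the CCCR transition function Q((x, y), x', y')."""
    x, y = position
    if x == y:                         # capture already happened: tokens stay put
        return (x, x)
    cop_legal = cop_move in closed_neighborhood(adj, x)
    robber_legal = robber_move in closed_neighborhood(adj, y)
    if cop_legal and robber_legal:
        if cop_move == y and robber_move == x:
            return (cop_move, cop_move)            # "en passant" capture
        return (cop_move, robber_move)
    if robber_legal:                   # the cop's illegal move is ignored
        return (x, robber_move)
    if cop_legal:                      # the robber's illegal move is ignored
        return (cop_move, y)
    return (x, y)                      # both moves are illegal


# Hypothetical example graph: the path 1 - 2 - 3.
PATH3 = {1: [2], 2: [1, 3], 3: [2]}
```

For instance, on this path the call `transition(PATH3, (1, 2), 2, 1)` models the two tokens swapping places and therefore returns a capture position.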


Roughly speaking, a strategy is a rule which, given a game history, prescribes a player’s next move. In CCCR 
the players gain an advantage by using randomized or mixed strategies. 

Definition 2.6 A randomized or mixed strategy is a function $\pi : S^* \times V \to [0, 1]$, which satisfies 

$$\forall ((x_0, y_0), (x_1, y_1), \ldots, (x_t, y_t)) \in S^* : \sum_{z \in V} \pi(z \mid (x_0, y_0), (x_1, y_1), \ldots, (x_t, y_t)) = 1$$

and gives the probability that at time $t$ the player moves into node $z$, given that the game has started at position 
$(x_0, y_0)$ and progressed through positions $(x_1, y_1), \ldots, (x_t, y_t)$. 

Two classes of strategies will be of special interest to us. 

Definition 2.7 A strategy $\pi$ is called memoryless iff 

$$\forall ((x_0, y_0), \ldots, (x_t, y_t)) \in S^*, \forall z \in V : \pi(z \mid (x_0, y_0), \ldots, (x_t, y_t)) = \pi(z \mid (x_t, y_t)),$$

i.e., the player’s move depends only on the current game position. 

Definition 2.8 A strategy $\pi$ is called deterministic iff 

$$\forall ((x_0, y_0), \ldots, (x_t, y_t)) \in S^* : \exists z : \pi(z \mid (x_0, y_0), \ldots, (x_t, y_t)) = 1,$$

i.e., for every game history $(x_0, y_0), \ldots, (x_t, y_t)$, there is a node $z$ to which the player will move with certainty. 

If $\pi$ is deterministic, it can be equivalently described by a function $\sigma : S^* \to V$ which is determined by $\pi$ as 
follows: 

$$\sigma((x_0, y_0), \ldots, (x_t, y_t)) = z \text{ iff } \pi(z \mid (x_0, y_0), \ldots, (x_t, y_t)) = 1.$$

Similarly, if $\pi$ is memoryless and deterministic, it can be equivalently described by a function $\sigma : S \to V$ which 
is determined by $\pi$ as follows: 

$$\sigma(x_t, y_t) = z \text{ iff } \pi(z \mid (x_t, y_t)) = 1.$$

The above definitions and remarks concern CCCR strategies. Regarding TBCR strategies, it is well known 
[8] that both players lose nothing by restricting themselves to memoryless deterministic strategies of the form 
$\overline{\sigma} : \overline{S} \to V$. In other words, if player P uses the strategy $\overline{\sigma}$ and the current game position is $(x, y, P)$ (which 
means that it is P’s turn to move) P moves his token into node $\overline{\sigma}(x, y, P)$. Obviously P will only use $\overline{\sigma}$ when 
it is his turn to move; hence we can use the notation $\overline{\sigma}_C(x, y)$ when talking about cop strategies and $\overline{\sigma}_R(x, y)$ 
when talking about robber strategies. Note that a cop strategy is also defined for the initial position $(\emptyset, \emptyset, C)$ and 
a robber strategy is also defined for the initial positions $(x, \emptyset, R)$ (for every $x \in V$). 
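
Definitions 2.6 to 2.8 can be made concrete with a small sketch. It assumes, purely for illustration, that a mixed strategy is represented as a Python function mapping a history (a list of positions) to a dictionary of move probabilities; the helper names and the example graph are hypothetical.

```python
from fractions import Fraction


def uniform_over_neighborhood(adj):
    """Memoryless mixed robber strategy: uniform on N[y] of the current position."""
    def strategy(history):
        _, y = history[-1]                  # only the current position is used
        moves = sorted({y} | set(adj[y]))
        return {z: Fraction(1, len(moves)) for z in moves}
    return strategy


def is_mixed_strategy_at(strategy, history):
    """Check the normalization condition of Definition 2.6 for one history."""
    dist = strategy(history)
    return all(p >= 0 for p in dist.values()) and sum(dist.values()) == 1


def is_deterministic_at(strategy, history):
    """Definition 2.8 at one history: some node is chosen with probability one."""
    return any(p == 1 for p in strategy(history).values())


def stay_put(history):
    """A deterministic memoryless robber strategy: never move."""
    _, y = history[-1]
    return {y: Fraction(1)}


PATH3 = {1: [2], 2: [1, 3], 3: [2]}         # hypothetical example graph
uniform = uniform_over_neighborhood(PATH3)
```

Because `uniform` reads only `history[-1]`, two histories that end in the same position always receive the same distribution, which is exactly the memoryless property.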

3 Cop Numbers 

In the “classical” TBCR game we have the following. 

Definition 3.1 The cop number c(G) of a graph G is the minimum number of cops sufficient to capture the 
robber when TBCR is played (optimally by both players) on G. 

Note that in the above definition optimal play includes optimal initial placement (in the 0-th round) of the 
cops and robber on $G$. On the other hand, in the CCCR game $\Gamma^{G,K}_{(x,y)}$ the initial cops and robber positions are 
given (rather than chosen by the players). A reasonable definition of cop number should account for all possible 
initial positions. Hence we have the following. 

Definition 3.2 The concurrent cop number $\tilde{c}(G)$ of graph $G$ is the minimum number of cops sufficient to ensure 
capture with probability one for every initial position $(x_0, y_0)$, when CCCR is played (optimally by both players) 
on $G$. 

Note also the expression “capture with probability one” in Definition 3.2. This is different from “certain 
capture” in the sense that there may exist infinite game histories in which capture does not occur, but the 
probability of any such infinite history materializing is zero.³ 

In what follows, whenever we mention an arbitrary robber (or cop) move sequence $y_0, y_1, y_2, \ldots$ we assume 
that it is a legal move sequence, i.e., for all $t$ we have $y_{t+1} \in N[y_t]$. Also, if capture occurs at time $t_c$, the 
robber’s (and cop’s) location remains fixed at $y_t = y_{t_c}$, irrespective of the moves $y_{t_c+1}, y_{t_c+2}, \ldots$. 

Lemma 3.3 $c(G) = 1 \Rightarrow \tilde{c}(G) = 1$. 

Proof. We select an arbitrary graph $G$ with $c(G) = 1$ and fix it for the rest of the proof. Both TBCR and 
CCCR will be played on this $G$. We let $n = |V|$, i.e., $n$ is the number of nodes of $G$. 

We will prove the lemma by constructing a (randomized and memoryless) cop strategy $\pi^{\#}_C$ which 
guarantees, for every starting position, CCCR capture with probability 1. 

An essential component of $\pi^{\#}_C$ is a deterministic cop strategy $\sigma^*_C$, constructed from another deterministic, 
memoryless cop strategy $\overline{\sigma}^*_C$ which guarantees capture in the TBCR game. Since $c(G) = 1$ we know [8] that 
such a $\overline{\sigma}^*_C$ exists and guarantees capture in at most $T$ rounds, where $T$ depends only on $G$. Furthermore recall 
that we have defined TBCR so that after capture takes place both C and R stay in place. The rest of the proof 
will be divided into two parts. 

Part 1. Consider the CCCR game and assume that, for every time $t$, C knows R’s next move (this assumption 
will be removed in Part 2). Take an arbitrary starting position $s_0 = (x_0, y_0)$ and suppose that at time $t$, when 
the position is $(x_t, y_t)$, C (knowing that R’s next move will be $y_{t+1}$) plays $x_{t+1} = \sigma^*_C(x_t, y_{t+1}) = \overline{\sigma}^*_C(x_t, y_{t+1})$.⁴ 
Then, for any robber moves $y_1, y_2, \ldots$ in rounds $t = 1, 2, \ldots$ the sequence of game positions will be: 

$$(x_0, y_0),\ (x_1 = \sigma^*_C(x_0, y_1), y_1),\ (x_2 = \sigma^*_C(x_1, y_2), y_2),\ \ldots,\ (x_t = \sigma^*_C(x_{t-1}, y_t), y_t),\ \ldots$$

We will prove that $x_T = y_T$, i.e., capture results in at most $T$ rounds and this will be true with certainty for any 
starting position $s_0 = (x_0, y_0)$ and robber moves $y_1, y_2, \ldots$ thereafter. 

To show this, consider a TBCR game in which, at the end of the initial round ($t = 0$) the position is 
$(\overline{x}_0, \overline{y}_0, C) = (x_0, y_1, C)$. Further suppose that C uses $\overline{\sigma}^*_C$ and R plays the moves $\overline{y}_1, \overline{y}_2, \ldots$ with $\overline{y}_t = y_{t+1}$ (for 
$t = 0, 1, \ldots$). Note that, given $\overline{y}_0 = y_1$, and also that $y_1, y_2, \ldots$ are legal robber moves in CCCR, the resulting 
robber moves $\overline{y}_1, \overline{y}_2, \ldots$ in TBCR are also legal. Moreover recall that, for any given starting position $(\overline{x}_0, \overline{y}_0)$, when 
the robber moves are $\overline{y}_1, \overline{y}_2, \ldots$ and C uses $\overline{\sigma}^*_C$, we get a sequence of cop and robber locations of the following 
form: 

$$\overline{x}_0,\ \overline{y}_0,\ \overline{x}_1 = \overline{\sigma}^*_C(\overline{x}_0, \overline{y}_0),\ \overline{y}_1,\ \overline{x}_2 = \overline{\sigma}^*_C(\overline{x}_1, \overline{y}_1),\ \overline{y}_2,\ \ldots,\ \overline{x}_t = \overline{\sigma}^*_C(\overline{x}_{t-1}, \overline{y}_{t-1}),\ \overline{y}_t,\ \ldots$$

3 This point is further discussed in Section 6. 

4 Note that $\sigma^*_C$ is deterministic and only uses two inputs: one is $x_t$ (from the previous round) and the other is $y_{t+1}$ (from the 
current round). Hence $\sigma^*_C$ is memoryless in the sense that it only requires knowledge of the immediate past position, but it is also 
prescient in the sense that it requires knowledge of the current robber move. 






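Given $\overline{y}_t = y_{t+1}$ for $t = 0, 1, \ldots$, $\overline{x}_0 = x_0$ and that C uses $\overline{\sigma}^*_C$, we get $\overline{x}_1 = \overline{\sigma}^*_C(\overline{x}_0, \overline{y}_0) = \sigma^*_C(x_0, y_1) = x_1$, 
$\overline{x}_2 = \overline{\sigma}^*_C(\overline{x}_1, \overline{y}_1) = \sigma^*_C(x_1, y_2) = x_2$, $\ldots$, $\overline{x}_t = \overline{\sigma}^*_C(\overline{x}_{t-1}, \overline{y}_{t-1}) = \sigma^*_C(x_{t-1}, y_t) = x_t$, $\ldots$. Thus the resulting 
sequence of cop and robber locations in TBCR is: 

$$x_0,\ y_1,\ x_1,\ y_2,\ x_2,\ y_3,\ \ldots,\ x_t,\ y_{t+1},\ \ldots \quad (1)$$

Since $\overline{\sigma}^*_C$ guarantees capture by time $T$ in TBCR, we have $\overline{x}_T = \overline{y}_T$, irrespective of the moves $\overline{y}_1, \overline{y}_2, \ldots$. In 
fact we will have $\overline{x}_T = \overline{y}_{T-1}$, i.e., at most in the first (i.e., cop) phase of round $T$, C captures R, or else (i.e., if 
$\overline{x}_T \neq \overline{y}_{T-1}$) R can stay put in this round and then $\overline{x}_T \neq \overline{y}_T$, which is a contradiction. Since $\overline{x}_T = x_T$ and $\overline{y}_{T-1} = y_T$, 
from $\overline{x}_T = \overline{y}_{T-1}$ we have $x_T = y_T$. 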

We conclude that also in the CCCR game, for any starting position $s_0 = (x_0, y_0)$ and subsequent robber 
moves $y_1, y_2, \ldots$, capture takes place by the $T$-th round at the latest. We repeat that this holds under the 
assumption that in each round $t$, C knows R’s next move $y_{t+1}$. 

Part 2. In the actual CCCR game C will not know R’s next move $y_{t+1}$; however he can always guess $y_{t+1}$ to 
be some node $v$. Suppose that, when R is at $y_t$, C guesses with uniform probability that R will move to $v \in N[y_t]$. 

Let $y_{t+1}$ be R’s actual move at $t + 1$ and $\hat{y}_{t+1}$ be C’s guess of that move. We have 

$$\Pr(\hat{y}_{t+1} = v \mid y_{t+1} = v) = \frac{1}{|N[y_t]|} \geq \frac{1}{n}$$

and 

$$\Pr(\text{C guesses R's move correctly}) = \Pr(\hat{y}_{t+1} = y_{t+1}) = \sum_{v \in N[y_t]} \Pr(\hat{y}_{t+1} = v \mid y_{t+1} = v) \Pr(y_{t+1} = v) \geq \frac{1}{n} \sum_{v \in N[y_t]} \Pr(y_{t+1} = v) = \frac{1}{n}.$$

In other words, C guesses correctly R’s next move with probability at least $\frac{1}{n}$. It follows that C guesses correctly 
R’s next $T$ moves (and captures R) with probability at least $\left(\frac{1}{n}\right)^T$. 
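Now we define the following sets of CCCR infinite game histories: 

$$\forall k \in \mathbb{N} : A_k = \{ s : s \in S^\infty \text{ and R is still free after the first } k \cdot T \text{ rounds} \},$$

$$A = \limsup_{k} A_k = \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} A_k.$$

Since $A_{k+1} \subseteq A_k$ (for all $k \in \mathbb{N}$) we have 

$$A = \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} A_k = \bigcap_{m=1}^{\infty} A_m = \{ s : s \in S^\infty \text{ and } \forall m \in \mathbb{N} : \text{R is still free after the first } m \cdot T \text{ rounds} \}.$$

In other words, $A$ is the set of all CCCR infinite game histories in which R is never captured. Since 

$$\sum_{k=1}^{\infty} \Pr(A_k) \leq \sum_{k=1}^{\infty} \left( 1 - \left(\frac{1}{n}\right)^T \right)^k < \infty$$

we have (from the first Borel-Cantelli lemma [5]) that $\Pr(A) = 0$. 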

To sum up: using the deterministic memoryless strategy $\sigma^*_C$ in conjunction with uniform guessing, we obtain 
a randomized memoryless strategy $\pi^{\#}_C$ which guarantees capture with probability one in the CCCR game; thus 
$\tilde{c}(G) = 1$ as claimed. ■ 
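
To get a feeling for the bound used in Part 2, the following sketch evaluates the lower bound $1 - (1 - n^{-T})^k$ on the probability that capture has occurred within the first $k \cdot T$ rounds under uniform guessing. The function name and the sample numbers are illustrative assumptions; the bound is the crude one from the proof and is usually far from tight.

```python
from fractions import Fraction


def capture_probability_lower_bound(n, T, k):
    """Lower bound on Pr(capture within the first k*T rounds).

    Each block of T rounds ends in capture with probability at least (1/n)**T
    (all T uniform guesses are correct), so the robber is still free after
    k blocks with probability at most (1 - (1/n)**T)**k.
    """
    block_success = Fraction(1, n) ** T
    return 1 - (1 - block_success) ** k


# Illustrative numbers only: n = 3 nodes and T = 2 rounds.
bounds = [capture_probability_lower_bound(3, 2, k) for k in (1, 10, 100)]
```

The bound increases towards one as $k$ grows, which is the quantitative content of "capture with probability one".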

In case of a graph $G$ with $c(G) = K$, it is straightforward to extend the previous argument using $K$ cops. 
Note however that, at the end of the proof we will know that: if $K$ cops are required to capture the robber in 
TBCR, then $K$ cops suffice to capture (with probability one) the robber in CCCR. Hence we have the following. 

Lemma 3.4 $c(G) = K \Rightarrow \tilde{c}(G) \leq K$. 

Next we show the “reverse” of Lemma 3.3. 

Lemma 3.5 $\tilde{c}(G) = 1 \Rightarrow c(G) = 1$. 

Proof. We select an arbitrary graph $G$ with $\tilde{c}(G) = 1$ and fix it for the rest of the proof. Both TBCR and 
CCCR will be played on this $G$. The Lemma can be stated equivalently as: 

$$c(G) > 1 \Rightarrow \tilde{c}(G) > 1$$

and this is what we will prove. 

If $c(G) > 1$ then there exists a (memoryless and deterministic) winning robber strategy $\overline{\sigma}_R$ for TBCR with 
one cop on $G$. More specifically, $\overline{\sigma}_R$ guarantees that, for every cop starting position $x_0$, the robber will never be 
captured. 

Choose any $x_0 \in V$ and let $y_0 = \overline{\sigma}_R(x_0, \emptyset)$. Using $\overline{\sigma}_R$ we will construct a CCCR robber strategy $\sigma^*_R$ such 
that: when CCCR (played on $G$ with a single cop) starts from position $(x_0, y_0)$ and R uses $\sigma^*_R$, the capture 
probability is zero. This, clearly, implies that $\tilde{c}(G) > 1$. 

It suffices to define $\sigma^*_R$ only for the case when CCCR starts from $(x_0, y_0)$, as follows. 

1. In round $t = 1$: $y_1 = y_0$ (R stays put); 

2. In rounds $t = 2, 3, \ldots$, R plays according to $\overline{\sigma}_R$. In other words, if $x_{t-1} = u$ and $y_{t-1} = v$, then 

$$y_t = \sigma^*_R(u, v) = \overline{\sigma}_R(u, v).$$

Clearly, $\sigma^*_R$ is not strictly memoryless. The move $y_1 = y_0$ depends not only on the game position $(x_0, y_0)$ but 
also on the fact that this is the first round. However, the part of $\sigma^*_R$ used in rounds $t \geq 2$ is memoryless. 

Suppose that in CCCR (starting from $(x_0, y_0)$) R plays the strategy $\sigma^*_R$ while C plays any move sequence 
$x_1, x_2, \ldots$. To prove that capture will never occur, consider a TBCR game in which R plays the strategy $\overline{\sigma}_R$ and C 
plays the same move sequence $x_0, x_1, \ldots$ as in CCCR. Since $\overline{\sigma}_R$ is winning, capture will never take place in TBCR; 
as will be shown, this implies capture will never occur in CCCR either and, since this holds for any $x_1, x_2, \ldots$, 
we will conclude that $\tilde{c}(G) > 1$. 

Let $\overline{y}_0, \overline{y}_1, \overline{y}_2, \ldots$ be the robber moves occurring in TBCR, given that R plays $\overline{\sigma}_R$ and C plays $x_0, x_1, \ldots$. Let us 
use $d(u, v)$ to denote the distance of nodes $u, v$ in $G$, i.e., the length of a shortest path between $u$ and $v$. Obviously 
we have 

$$\forall t \geq 0 : d(x_{t+1}, \overline{y}_t) \geq 1 \quad (2)$$

(if we had $d(x_{t+1}, \overline{y}_t) = 0$ then $\overline{\sigma}_R$ would not be a winning strategy). Furthermore 

$$\forall t \geq 0 : y_{t+1} = \overline{y}_t. \quad (3)$$

Indeed, $y_1 = \overline{y}_0$ by construction and if, for some $n$, we have $y_n = \overline{y}_{n-1}$, then 

$$y_{n+1} = \sigma^*_R(x_n, y_n) = \overline{\sigma}_R(x_n, \overline{y}_{n-1}) = \overline{y}_n.$$

From (2) and (3) follows that 

$$\forall t : 1 \leq d(x_t, y_t). \quad (4)$$

This almost completes the proof that capture never occurs in CCCR. However, we must also consider the 
possibility of an “en passant” capture, i.e., the case $x_{t+1} = y_t$ and $y_{t+1} = x_t$. But this would mean 

$$d(x_t, \overline{y}_t) = d(x_t, y_{t+1}) = d(x_t, x_t) = 0;$$

in other words, we would have capture in TBCR which contradicts the assumption that $\overline{\sigma}_R$ is a winning robber 
strategy. Hence “en passant” capture is also impossible in CCCR. The proof is complete. ■ 

It is straightforward to extend the above for the case of $\tilde{c}(G) = K$ and obtain the following. 

Lemma 3.6 $\tilde{c}(G) = K \Rightarrow c(G) \leq K$. 

Now we can prove our main result. 

Theorem 3.7 $c(G) = K \Leftrightarrow \tilde{c}(G) = K$. 

Proof. Assume that $c(G) = K$. By Lemma 3.4 we have $c(G) = K \Rightarrow \tilde{c}(G) \leq K$; if $\tilde{c}(G) = K' < K$, then by 
Lemma 3.6 we have $c(G) \leq K' < K = c(G)$, which is a contradiction. Thus $c(G) = K \Rightarrow \tilde{c}(G) = K$. 

Conversely, assume that $\tilde{c}(G) = K$. By Lemma 3.6 we have $\tilde{c}(G) = K \Rightarrow c(G) \leq K$; if $c(G) = K' < K$, then 
by Lemma 3.4 we have $\tilde{c}(G) \leq K' < K = \tilde{c}(G)$, which is a contradiction. Thus $\tilde{c}(G) = K \Rightarrow c(G) = K$. ■ 

4 Time Optimality 


4.1 Existence of Value and Optimal Strategies 

Recall that $\Gamma^G_{(x_0, y_0)}$ denotes the CCCR game played on graph $G$ by a single cop starting at location $x_0$ and a 
single robber starting at location $y_0$. We equip $\Gamma^G_{(x_0, y_0)}$ with a payoff function, defined as follows. First define 
the auxiliary function 

$$\forall (x, y) \in V^2 : r(x, y) = \begin{cases} 1 & \text{when } x \neq y, \\ 0 & \text{when } x = y, \end{cases}$$

where $x$ and $y$ are cop and robber locations, respectively. Suppose that for every round of $\Gamma^G_{(x_0, y_0)}$ in which 
the robber remains uncaptured, C pays R one unit of utility and denote by $v^G_{(x_0, y_0)}(\pi_C, \pi_R)$ the total amount 
collected by R (obviously it depends on the strategies $\pi_C, \pi_R$). Then the payoff of $\Gamma^G_{(x_0, y_0)}$ is 

$$v^G_{(x_0, y_0)}(\pi_C, \pi_R) = E\left( \sum_{t=0}^{\infty} r(x_t, y_t) \right)$$

where $E(\cdot)$ denotes expected value and, for notational brevity, the dependence of $x_t, y_t$ on $\pi_C, \pi_R$ has been 
suppressed. 

Following the terminology of [5], we recognize that CCCR equipped with the above payoff is a positive 
stochastic game, R is Player 1 or the Maximizer and C is Player 2 or the Minimizer. These terms reflect the 
fact that R (resp. C) chooses $\pi_R$ (resp. $\pi_C$) to maximize (resp. to minimize) $v^G_{(x,y)}(\pi_C, \pi_R)$. We always have 

$$\sup_{\pi_R} \inf_{\pi_C} v^G_{(x,y)}(\pi_C, \pi_R) \leq \inf_{\pi_C} \sup_{\pi_R} v^G_{(x,y)}(\pi_C, \pi_R). \quad (5)$$

The following is standard game theoretic terminology [5]. 

Definition 4.1 If we have 

$$\inf_{\pi_C} \sup_{\pi_R} v^G_{(x,y)}(\pi_C, \pi_R) = \sup_{\pi_R} \inf_{\pi_C} v^G_{(x,y)}(\pi_C, \pi_R) \quad (6)$$

then we denote the common quantity of (6) by $v^G_{(x,y)}$ and call it the value of $\Gamma^G_{(x,y)}$. 

Definition 4.2 We denote the capture time of $G$ by $CT(G)$ and define it by 

$$CT(G) = \max_{(x,y) \in V^2} v^G_{(x,y)}.$$

What is the connection between the $\Gamma^G_{(x,y)}$ for various $(x, y) \in V^2$? It is natural to assume that if at some 
stage of $\Gamma^G_{(x,y)}$ we reach the position $(x', y')$ then we can play the remaining portion of $\Gamma^G_{(x,y)}$ as if we are just 
starting the game $\Gamma^G_{(x',y')}$. This plausible assumption can be proved rigorously (see [20] and [5, pp. 89-91]) and 
has the important consequence that, for a given $G$, $v^G_{(x,y)}$ is the same for every game $\Gamma^G_{(x',y')}$ (and hence it is 
correct to omit mention of a specific game in the notations $v^G_{(x,y)}(\pi_C, \pi_R)$ and $v^G_{(x,y)}$). An additional important 
consequence is the existence of memoryless optimal strategies which are the same for all $\Gamma^G_{(x,y)}$ games, as will be 
seen in Theorem 4.4. Before stating and proving this theorem we need some additional definitions. 

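Definition 4.3 Given $\varepsilon \geq 0$, we say that the cop strategy $\pi^\varepsilon_C$ is $\varepsilon$-optimal (for the game $\Gamma^G_{(x,y)}$) iff 

$$\left| v^G_{(x,y)} - \sup_{\pi_R} v^G_{(x,y)}(\pi^\varepsilon_C, \pi_R) \right| \leq \varepsilon.$$

Similarly, we say that the robber strategy $\pi^\varepsilon_R$ is $\varepsilon$-optimal (for the game $\Gamma^G_{(x,y)}$) iff 

$$\left| v^G_{(x,y)} - \inf_{\pi_C} v^G_{(x,y)}(\pi_C, \pi^\varepsilon_R) \right| \leq \varepsilon.$$

A 0-optimal (cop or robber) strategy is simply called optimal. 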








If both $\pi^*_C$ and $\pi^*_R$ are optimal, then we have $v^G_{(x,y)} = v^G_{(x,y)}(\pi^*_C, \pi^*_R)$. The main facts about the $\Gamma^G_{(x,y)}$ games 
are summarized in the following. 

Theorem 4.4 For every graph $G = (V, E)$ and every $(x, y) \in V^2$ the following hold. 

1. For every $(x, y) \in V^2$, the game $\Gamma^G_{(x,y)}$ has the value $v^G_{(x,y)}$. 

2. There exists a memoryless cop strategy $\pi^*_C$ which is optimal for every game $\Gamma^G_{(x,y)}$. For every $\varepsilon > 0$, there 
exists a memoryless robber strategy $\pi^\varepsilon_R$ which is $\varepsilon$-optimal for every game $\Gamma^G_{(x,y)}$. 

3. $V^2$ can be partitioned into the sets $V_1$ and $V_2$ defined by 

$$V_1 = \{ (x, y) : v^G_{(x,y)} < \infty \}, \quad V_2 = \{ (x, y) : v^G_{(x,y)} = \infty \}.$$

4. If $c(G) = 1$, then $V_1 = V^2$, i.e., $v^G_{(x,y)} < \infty$ for every $(x, y) \in V^2$. 

Proof. Parts 1 and 2 follow immediately from the results of [6]. Part 3, the partition of $V^2$ into $V_1$ and $V_2$, is 
just a definition. It remains to show part 4, i.e., that $c(G) = 1 \Rightarrow V_1 = V^2$. This will also follow from [6] if we 
can show the existence of a cop strategy $\pi^{\#}_C$ and a constant $M_G$ such that 

$$\forall \pi_R, x, y : v^G_{(x,y)}(\pi^{\#}_C, \pi_R) \leq M_G < \infty; \quad (7)$$

in other words, $\pi^{\#}_C$ guarantees finite (not necessarily optimal) capture time for every robber strategy and every 
starting position. 

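The required $\pi^{\#}_C$ is the strategy used in the proof of Lemma 3.3. Indeed, recall that 

$$A_k = \{ s : s \in S^\infty \text{ and R is still free after the first } k \cdot T \text{ rounds} \}$$

and when C uses $\pi^{\#}_C$ and R uses any $\pi_R$ we have 

$$\sum_{k=1}^{\infty} \Pr(A_k) \leq \sum_{k=1}^{\infty} \left( 1 - \left(\frac{1}{n}\right)^T \right)^k.$$

It follows that 

$$\forall \pi_R : \forall x_0, y_0 : v^G_{(x_0, y_0)}(\pi^{\#}_C, \pi_R) = E\left( \sum_{t=0}^{\infty} r(x_t, y_t) \right) \leq \sum_{k=1}^{\infty} k \cdot T \cdot \Pr(A_k) \leq \sum_{k=1}^{\infty} k \cdot T \cdot \left( 1 - \left(\frac{1}{n}\right)^T \right)^k < \infty.$$

Letting $M_G = \max_{(x,y) \in V^2} M_{G,x,y} < \infty$, where $M_{G,x,y}$ is the bound obtained above for the starting position $(x, y)$ 
and $M_G$ depends only on $G$, we see that $\pi^{\#}_C$ satisfies (7) and the proof is complete. ■ 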


Remark 4.5 The theorem can be extended to the game $\Gamma^{G,K}_{(x,y)}$ for any graph $G$ (with any $c(G)$), any number 
of cops $K$ and any initial position $(x, y)$ (we will now have $x \in V^K$). If $K \geq c(G)$, then $V_1 = V^{K+1}$. Note 
that the set $V_1$ will never be empty; for example, when $K = 1$, $(x, x)$ belongs to $V_1$ for any $c(G) \in \mathbb{N}$ (since 
$v^G_{(x,x)}(\pi_C, \pi_R) = 0$ for any $G$, $x$, $\pi_C$, $\pi_R$). 


Remark 4.6 Parts 1, 2 and 3 of the theorem can also be proved immediately using the results of either [5] or 


4.2 Computation of Value and Optimal Strategies 

The value and optimal strategies of $\Gamma^G_{(x,y)}$ can be computed by value iteration, as shown by Theorem 4.7. Before 
presenting the theorem and its proof let us give its intuitive justification. 

Suppose at time $t$ the game position is $(x, y)$. As already mentioned, we can assume that the “remainder 
game” is $\Gamma^G_{(x,y)}$, i.e., it can be played as a new CCCR game starting at $(x, y)$; the remainder game has value 
$v^G_{(x,y)}$. Suppose further that C uses the move $u$ and R uses the move $v$. The new game position is 

$$(x', y') = Q((x, y), u, v),$$

R receives 

$$r(x', y') = r(Q((x, y), u, v))$$

units from C and, invoking memorylessness again, the remainder game is $\Gamma^G_{(x',y')} = \Gamma^G_{Q((x,y),u,v)}$ and has value 
$v^G_{(x',y')} = v^G_{Q((x,y),u,v)}$. To describe the relationship between $v^G_{(x,y)}$ and $v^G_{Q((x,y),u,v)}$ we need some new notation. 

Recall that a finite two-person zero-sum game in normal form can be specified by a single $M \times N$ matrix $A$. 
The game is played in a single round as follows: simultaneously the maximizing Player 1 chooses the row 
index $m$ and the minimizing Player 2 chooses the column index $n$; then Player 2 pays to Player 1 the amount 
$A_{mn}$. It is well known that every such game has a value and many algorithms are available to compute it. We 
denote the game matrix $A$ by the notation $\{A_{mn}\}_{m=1,\ldots,M}^{n=1,\ldots,N}$ and its value by $\mathrm{Val}\left[\{A_{mn}\}_{m=1,\ldots,M}^{n=1,\ldots,N}\right]$. 

It seems reasonable (and can be rigorously justified) that $\Gamma^G_{(x,y)}$ can be considered as a single-round finite 
two-person zero-sum game as follows: when C chooses move $u$ and R chooses move $v$ the payoff to R is 

$$r(Q((x, y), u, v)) + v^G_{Q((x,y),u,v)}. \quad (8)$$

In other words, R receives $r(Q((x, y), u, v))$ units as the payoff of the current round and $v^G_{Q((x,y),u,v)}$ units 
as the payoff of the “remainder game” $\Gamma^G_{Q((x,y),u,v)}$ (which is assumed to be played optimally by both players). 
Hence the game matrix of $\Gamma^G_{(x,y)}$ is $\left\{ r(Q((x, y), u, v)) + v^G_{Q((x,y),u,v)} \right\}_{u \in V}^{v \in V}$ and has value 

$$v^G_{(x,y)} = \mathrm{Val}\left[ \left\{ r(Q((x, y), u, v)) + v^G_{Q((x,y),u,v)} \right\}_{u \in V}^{v \in V} \right]. \quad (9)$$

Note that (9) holds when $x \neq y$; for $x = y$ we obviously have $v^G_{(x,y)} = 0$. 

The above is an informal argument for the connection between the values $v^G_{(x,y)}$. The following theorem shows 
that the argument can be made rigorous; furthermore, the theorem provides a method for computing the values, 
as well as the optimal strategies. 

Theorem 4.7 For every graph $G = (V, E)$ with $c(G) = 1$ and for every $(x, y) \in V^2$ the values $\left\{ v^G_{(x,y)} \right\}_{(x,y) \in V^2}$ 
are the smallest (componentwise) positive solution of the system of optimality equations: 

$$v_{(x,y)} = \mathrm{Val}\left[ \left\{ r(Q((x, y), u, v)) + v_{Q((x,y),u,v)} \right\}_{u \in V}^{v \in V} \right] \text{ when } x \neq y, \quad (10)$$

$$v_{(x,y)} = 0 \text{ when } x = y. \quad (11)$$

Furthermore, for $n = 0, 1, 2, \ldots$, define $v_{(x,y)}(n)$ by the initial conditions 

$$v_{(x,y)}(0) \geq 0 \text{ when } x \neq y \text{ and } v_{(x,y)}(0) = 0 \text{ when } x = y$$

and, for $n \in \mathbb{N}$, the recursion (value iteration) 

$$v_{(x,y)}(n + 1) = \mathrm{Val}\left[ \left\{ r(Q((x, y), u, v)) + v_{Q((x,y),u,v)}(n) \right\}_{u \in V}^{v \in V} \right] \text{ when } x \neq y, \quad (12)$$

$$v_{(x,y)}(n + 1) = 0 \text{ when } x = y. \quad (13)$$

Then 

$$\forall (x, y) \in V^2 : \lim_{n \to \infty} v_{(x,y)}(n) = v^G_{(x,y)}.$$
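
A value iteration of this kind can be sketched in pure Python. The sketch rests on several assumptions that are not taken from the paper: the graph is an adjacency dictionary, moves are restricted to closed neighborhoods (an illegal move has the same effect as staying put, so the value is unchanged), the iteration starts from the all-zero vector, and one unit is charged for the current non-capture position so that the limit equals the total payoff $\sum_{t \geq 0} r(x_t, y_t)$. The matrix game is solved by brute-force vertex enumeration, which is only sensible for tiny matrices; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def solve_linear(a, b):
    """Solve the square system a z = b exactly (Gauss-Jordan); None if singular."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [entry / m[col][col] for entry in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [e - factor * p for e, p in zip(m[r], m[col])]
    return [row[n] for row in m]


def matrix_game_value(a):
    """Val[a] for the zero-sum matrix game a; the row player maximizes.

    The optimum of the maximizer's linear program (maximize v subject to
    p in the simplex and v <= sum_i p_i a[i][j] for all j) is attained at a
    vertex, so candidate vertices are enumerated and the best one is kept.
    """
    rows, cols = len(a), len(a[0])
    zero, one = Fraction(0), Fraction(1)
    constraints = []                              # unknowns: (p_0..p_{rows-1}, v)
    for i in range(rows):                         # p_i = 0
        constraints.append([one if k == i else zero for k in range(rows + 1)])
    for j in range(cols):                         # sum_i p_i a[i][j] - v = 0
        constraints.append([Fraction(a[i][j]) for i in range(rows)] + [-one])
    normalization = [one] * rows + [zero]         # sum_i p_i = 1
    rhs = [one] + [zero] * rows
    best = None
    for active in combinations(constraints, rows):
        z = solve_linear([normalization] + list(active), rhs)
        if z is None:
            continue
        p, v = z[:rows], z[rows]
        feasible = min(p) >= 0 and all(
            sum(p[i] * a[i][j] for i in range(rows)) >= v for j in range(cols))
        if feasible and (best is None or v > best):
            best = v
    return best


def step(x, y, u, v):
    """Next position when the cop plays u in N[x] and the robber plays v in N[y]."""
    if u == v or (u == y and v == x):             # ordinary or "en passant" capture
        return (u, u)
    return (u, v)


def value_iteration(adj, sweeps):
    """Run `sweeps` rounds of value iteration from the all-zero initial condition."""
    nodes = sorted(adj)
    closed = {u: sorted({u} | set(adj[u])) for u in nodes}
    values = {(x, y): Fraction(0) for x in nodes for y in nodes}
    for _ in range(sweeps):
        updated = {}
        for (x, y) in values:
            if x == y:
                updated[(x, y)] = Fraction(0)
                continue
            # Rows: robber moves (maximizer); columns: cop moves (minimizer).
            game = [[values[step(x, y, u, v)] for u in closed[x]] for v in closed[y]]
            updated[(x, y)] = 1 + matrix_game_value(game)
        values = updated
    return values


PATH5 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
TRIANGLE = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
path_values = value_iteration(PATH5, 6)
triangle_values = value_iteration(TRIANGLE, 8)
```

Exact rational arithmetic is used so that fixed points such as the integer capture times on a path are reproduced without rounding error; a linear programming library would be the natural replacement for `matrix_game_value` on larger graphs.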

Proof. This is essentially the combination of Theorems 4.4.3 and 4.4.4 from [5], but the following modifications 
are required. 

In [5] the optimality equations (10)-(11) and the recursion (12)-(13) are given in terms of transition 
probabilities which in our notation would be written as $P((x', y') \mid (x, y), u, v)$; this is the probability that the new 
position is $(x', y')$ given the old position is $(x, y)$ and the player moves are $u$ and $v$. However, in CCCR transitions 
are deterministic, i.e., 

$$P((x', y') \mid (x, y), u, v) = \begin{cases} 1 & \text{when } (x', y') = Q((x, y), u, v), \\ 0 & \text{otherwise.} \end{cases}$$

Furthermore, once the game reaches a capture position $(x, x)$, it will always stay in this position, which has value 
0. 

Taking the above into account, the optimality equations and the recursion of [5] reduce to (10)-(13). ■ 

Remark 4.8 The modification of the theorem for the game $\Gamma^{G,K}_{(x,y)}$, with $K > 1$, is obvious. 

We conclude this section with some examples. We apply the value iteration (12)-(13) to several graphs and 
discuss the results. 


Example 4.9 In the first example $G$ is a path of five nodes, as illustrated in Figure 1. 

Figure 1: A five node path. 

Let the $(x, y)$ element of matrix $V^G$ be equal to $v^G_{(x,y)}$, the value of $\Gamma^G_{(x,y)}$ when the cop is at node $x$ and the 
robber at node $y$. Value iteration yields 

$$V^G = \begin{bmatrix} 0 & 4 & 4 & 4 & 4 \\ 1 & 0 & 3 & 3 & 3 \\ 2 & 2 & 0 & 2 & 2 \\ 3 & 3 & 3 & 0 & 1 \\ 4 & 4 & 4 & 4 & 0 \end{bmatrix}.$$

Cop and robber optimal strategies can be described quite easily: the cop should always move towards the robber 
and the robber should always move away from the cop.⁵ Clearly in this graph the CCCR time-optimal 
strategies are the same as those for the TBCR game. 

Example 4.10 Not surprisingly, for the tree $G$ illustrated in Figure 2 the optimal cop and robber strategies 
are again the same for the CCCR and TBCR games. 

Figure 2: A tree. 

5 Actually the robber has several other optimal strategies; he can also stay in place if the cop is at a distance greater than one. 
These are pure (i.e., deterministic) strategies; the robber also has mixed optimal strategies. 

The value-iteration algorithm yields 

$$V^G = \begin{bmatrix} 0 & 1 & 2 & 2 & 2 \\ 3 & 0 & 3 & 3 & 3 \\ 2 & 2 & 0 & 1 & 1 \\ 3 & 3 & 3 & 0 & 2 \\ 3 & 3 & 3 & 2 & 0 \end{bmatrix}.$$

Example 4.11 The next example involves the clique of three nodes, as illustrated in Figure 3. 

Figure 3: A three node clique. 

After eight iterations, the algorithm yields $\hat{V}^G$ which is (componentwise) within $10^{-2}$ of the true solution 

$$V^G = \begin{bmatrix} 0 & 2 & 2 \\ 2 & 0 & 2 \\ 2 & 2 & 0 \end{bmatrix}.$$

The algorithm also yields the optimal strategies, which are symmetrical with respect to the cop and robber 
positions $(x_0, y_0)$. For example, when $(x_0, y_0) = (3, 1)$ we have 

$$\pi^*_C = \left[ \tfrac{1}{2}, \tfrac{1}{2}, 0 \right] \quad \text{and} \quad \pi^*_R = \left[ 0, \tfrac{1}{2}, \tfrac{1}{2} \right].$$

In other words, under these strategies, both cop and robber always move with equal probability to one of the 
two nodes they don’t currently occupy. It can be verified analytically that these strategies yield the previously 
displayed value matrix $V^G$. Because of symmetry, many other optimal strategies exist for both cop and robber. 
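
The analytic verification mentioned above can be sketched as follows, under the reading that the cop at node 3 moves to node 1 or 2 and the robber at node 1 moves to node 2 or 3, each with probability $\tfrac{1}{2}$ (here $u$ and $v$ denote the cop and robber moves). Capture occurs exactly when the tokens swap places ($u = 1$, $v = 3$) or meet ($u = v = 2$), and by symmetry the same holds in every non-capture position, so the capture time is geometrically distributed:

```latex
\begin{align*}
\Pr(\text{capture in a given round})
  &= \Pr(u = 1, v = 3) + \Pr(u = 2, v = 2)
   = \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}, \\
E(\text{capture time})
  &= \sum_{k=1}^{\infty} k \left( \tfrac{1}{2} \right)^{k} = 2 .
\end{align*}
```

This agrees with the off-diagonal entries of the value matrix for the three node clique.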


Example 4.12 The final example involves a Gavenciak graph [7], as illustrated in Figure 4. 

Figure 4: A Gavenciak graph. 

From the results of [7] we know that the TBCR capture time of this graph is 7 (this is the minimax of the capture 
time over all initial positions). The cop is able to achieve this capture time by first maneuvering himself to node 
7 and forcing the robber into the path subgraph, and then chasing the robber all the way to node 10. In the 
CCCR game the results are similar but they require the use of randomized strategies. We do not present the 
entire $V^G$ (because of space limitations) but let us give some indicative results. For example, when the initial 
positions are $(x_0, y_0) = (2, 1)$, the cop cannot be certain of capturing the robber in one move (since the moves 
are simultaneous). It turns out that, by the application of randomized strategies, the optimal expected capture 
time is $v^G_{(2,1)} = 18.82\ldots$. However, the part of the strategies which concerns the path subgraph is, as in TBCR, 
deterministic. For instance, once the cop reaches node 8 (with the robber in either node 9 or 10) he should 
deterministically perform the transitions $8 \to 9 \to 10$. Let us also note that for this ten-node graph, the value 
iteration algorithm required 90 iterations to get (componentwise) within $10^{-2}$ of the true solution. 
5 Related Work


While the assumption of simultaneous moves is a natural one (and is better than turn-based movement as a
model of real-world pursuit/evasion problems), it appears that CCCR has not been studied in the cops and
robber literature. However, our analysis of time optimal CCCR strategies follows closely the corresponding study
of time optimal TBCR strategies presented in [8] (and expanded in [4]). Both Hahn’s algorithm in [8] and the
recursion (12) of Theorem 4.7 are value iteration algorithms. The main difference between the two is this: while
in Hahn’s algorithm updating the value in every iteration only requires taking a minimum or a maximum, every
value iteration of (12) requires solving a one-round, zero-sum game (this is indicated by the Val[·] operator in
(12)). Consequently, (12) is computationally more intensive than Hahn’s algorithm.
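The difference in per-iteration cost can be illustrated with a small Python sketch. It is not Hahn’s algorithm nor the paper’s recursion: `pure_minimax` is only a stand-in for a min/max-type update, and `val` approximates the value of a one-round zero-sum matrix game by fictitious play (a technique swapped in here for brevity; a linear program would give the exact value). The one-round matrix is built for the three-node clique of Example 4.11, assuming that capture occurs when the players land on the same node or swap nodes, and that every non-capture state has continuation value 2. All names are hypothetical.

```python
def captured(x, y, x_next, y_next):
    """Assumed capture rule: same node after the move, or the players swap nodes."""
    return x_next == y_next or (x_next == y and y_next == x)


def one_round_matrix(x, y, v, nodes=(1, 2, 3)):
    """Capture time if the cop moves to the row node and the robber to the column
    node: 1 on capture, otherwise 1 plus the assumed continuation value v."""
    return [[1 if captured(x, y, xn, yn) else 1 + v for yn in nodes] for xn in nodes]


def pure_minimax(M):
    """Min/max-type update: one minimum over rows of one maximum over columns."""
    return min(max(row) for row in M)


def val(M, rounds=5000):
    """Approximate value of matrix game M (row player minimizes) by fictitious play.
    Returns the midpoint of the tightest lower and upper bounds found."""
    n_rows, n_cols = len(M), len(M[0])
    row_tot = [0.0] * n_cols  # payoff of each column against the rows played so far
    col_tot = [0.0] * n_rows  # payoff of each row against the columns played so far
    i = 0                     # arbitrary first row
    lo, hi = float("-inf"), float("inf")
    for t in range(1, rounds + 1):
        for j in range(n_cols):
            row_tot[j] += M[i][j]
        j = max(range(n_cols), key=row_tot.__getitem__)  # maximizer's best reply
        for k in range(n_rows):
            col_tot[k] += M[k][j]
        i = min(range(n_rows), key=col_tot.__getitem__)  # minimizer's best reply
        hi = min(hi, max(row_tot) / t)  # what the empirical row mix guarantees
        lo = max(lo, min(col_tot) / t)  # what the empirical column mix guarantees
    return (lo + hi) / 2


M = one_round_matrix(3, 1, 2)
print(pure_minimax(M), val(M))
```

On this matrix the pure min/max update returns 3, while the mixed value is 2, so 2 is a fixed point of the concurrent update under the stated assumptions. The min/max update is a single pass over the matrix, whereas the game value needs an iterative or linear-programming solve at every state and every iteration.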

As already mentioned, simultaneous moves have not been explored in the CR literature. On the other hand,
an interesting analog can be found in the literature of reachability games [?]. As we have pointed out in
[?], TBCR is a special case of a “classical” (i.e., turn-based) reachability game. Similarly, CCCR is a special
case of a concurrent reachability game.⁶ The literature on concurrent reachability games [1] can furnish useful
insights for the analysis of CCCR.

All the above problems can be considered as special cases of the general stochastic game. The book [5] is an 
excellent, comprehensive and relatively recent study of the topic; it also contains many references to important 
earlier work. 


6 Concluding Remarks 

We conclude this paper by presenting questions which, in our opinion, merit further study. 

One group of questions concerns the definition of cop number, which in turn depends on the definition of the
properties of the capture event. To understand the issue, we must turn back to concurrent reachability games.
Let $A$ be the set of all histories of a reachability game and $B \subseteq A$ the set of all realizations in which the target
state is reached (in CCCR, $B$ would be the set of all infinite histories $\{(x_t, y_t)\}_{t=0}^{\infty}$ for which there is some $t_c$
such that $x_{t_c} = y_{t_c}$). As pointed out in [?], the target state can be reached in at least three different senses (and
each of these implies the next one in the list).

1. Sure reachability: B = A. 

2. Almost sure reachability: Pr (B) = 1. 

3. Limit sure reachability: For every real $\epsilon > 0$, player 1 has a strategy such that for all strategies of player 2, the
target state is reached with probability greater than $1 - \epsilon$.
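To make the gap between items 1 and 2 concrete, consider the three-node clique of Example 4.11 and assume that the cop's uniform strategy captures with probability 1/2 in every round whatever the robber does. This assumption is consistent with the value 2 reported there, but it depends on a capture rule that this section does not restate. Writing $T$ for the capture time, we would then have:

```latex
\Pr(T \le t) = 1 - 2^{-t} \;\longrightarrow\; 1 \quad (t \to \infty),
\qquad
\mathbb{E}[T] = \sum_{t \ge 1} t \, 2^{-t} = 2 .
```

Under this assumption the target is reached with probability 1, so the strategy achieves almost sure reachability. Every finite capture-free prefix still has positive probability, so capture-free histories are not excluded and the strategy does not achieve sure reachability.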


The above carry over to CCCR and can be used to define corresponding cop numbers: $c_{\text{sure}}(G)$, $c_{\text{almost sure}}(G)$
and $c_{\text{limit sure}}(G)$. In this paper we have worked exclusively with $c(G) = c_{\text{almost sure}}(G)$. Obviously

$$
c_{\text{sure}}(G) \ge c_{\text{almost sure}}(G) \ge c_{\text{limit sure}}(G)
$$


but several additional questions can be asked. For example, can the ratios

$$
\frac{c_{\text{sure}}(G)}{c_{\text{almost sure}}(G)} \quad \text{and} \quad \frac{c_{\text{almost sure}}(G)}{c_{\text{limit sure}}(G)}
$$

be bounded by a constant? By a number depending on the size of $G$? How about the differences
$c_{\text{sure}}(G) - c_{\text{almost sure}}(G)$ and $c_{\text{almost sure}}(G) - c_{\text{limit sure}}(G)$?

Another group of questions concerns the CCCR variants obtained by modifying the cop’s and/or the robber’s 
behavior. 


1. For example, what is the cost of drunkenness? In other words, what is the ratio of expected capture times
for the previously described CCCR game and a variant in which the robber performs a random walk on the
nodes of the graph? The same question has been studied for the TBCR case in [10, 11].

2. Similarly, what is the cost of visibility? In this case we study the ratio of expected capture times for the
previously described CCCR game and a variant in which the robber is invisible to the cop. For the TBCR
case, this has been studied in [?].

6 This is the source of our term “concurrent CR game”. 